ERX3267235: Illumina NovaSeq 6000 paired end sequencing
25 ILLUMINA (Illumina NovaSeq 6000) runs: 783.6M spots, 235.1G bases, 81.6Gb downloads

Submitted by: NYGC
Study: 30X whole genome sequencing coverage of the 2504 Phase 3 1000 Genome samples.
Abstract:
We sequenced all 2,504 samples from the 1000 Genomes (1KG) Project to a minimum of 30x mean genome coverage. Though a small number of 1KG samples had been sequenced to high coverage previously, we sequenced all samples to depth on the latest technology, providing a unified dataset for the next phase of analyses. We processed these samples using the laboratory processes we have previously used for the CCDG project (with minor modifications). Specifically, we generated PCR-free sequencing libraries using unique dual indices to avoid index switching, which occurs on Illumina patterned flow cells and causes low-level contamination of sequencing data. We sequenced these samples on the Illumina NovaSeq 6000 sequencing instrument, with 2x150bp reads. We believe this instrument represents the future for WGS with short-read technology, and it was important to sequence the 1KG samples in a format that is consistent with future large-scale sequencing projects. Our automated analysis pipeline for whole genome sequencing matches the CCDG and TOPMed recommended best practices. Sequencing reads were aligned to the human reference, hs38DH, using BWA-MEM v0.7.15. Data are further processed following the GATK best practices (v3.5), which generate VCF files in the 4.2 format. Single nucleotide variants and indels are called using GATK HaplotypeCaller (v3.5), which generates a single-sample GVCF. Variant Quality Score Recalibration (VQSR) is performed using dbSNP138 so that quality metrics for each variant can be used in downstream variant filtering.
Sample: Coriell GM20787
SAMN00001300 • SRS001748
Organism: Homo sapiens
Library:
Name: NA20787
Instrument: Illumina NovaSeq 6000
Strategy: WGS
Source: GENOMIC
Selection: RANDOM
Layout: PAIRED
Construction protocol: TruSeq DNA PCR-free
Runs: 25 runs, 783.6M spots, 235.1G bases, 81.6Gb
Run          # of Spots     # of Bases   Size      Published
ERR3239860   391,803,660    117.5G       10.7Gb    2019-03-25
ERR3560882   38,114,495     11.4G        3.5Gb     2019-10-02
ERR3560884   36,855,770     11.1G        3.4Gb     2019-10-02
ERR3560886   37,756,624     11.3G        3.4Gb     2019-10-02
ERR3560888   37,630,887     11.3G        3.4Gb     2019-10-02
ERR3560890   30,078,556     9G           2.7Gb     2019-10-02
ERR3560892   29,590,441     8.9G         2.6Gb     2019-10-02
ERR3560894   29,855,286     9G           2.6Gb     2019-10-02
ERR3560896   29,854,016     9G           2.6Gb     2019-10-02
ERR3560898   30,411,607     9.1G         2.8Gb     2019-10-02
ERR3560900   30,441,934     9.1G         2.8Gb     2019-10-02
ERR3560902   30,753,708     9.2G         2.9Gb     2019-10-02
ERR3560904   30,460,336     9.1G         2.8Gb     2019-10-02
ERR4657838   unavailable                           2020-10-07
ERR4657839   unavailable                           2020-10-07
There are 10 omitted runs. See all runs in Run Selector.
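
The spot and base counts in the run listing can be cross-checked against the library layout. The sketch below is illustrative only. It assumes that each spot is one read pair, that every read is a full 150 bp (the abstract states 2x150bp reads; no trimming is assumed), and that "G" means 10^9 bases. The function names are hypothetical and not part of any SRA tool.

```python
# Hypothetical helper names; constants come from the record itself.
READ_LENGTH_BP = 150  # abstract: "2x150bp reads"
READS_PER_SPOT = 2    # Layout: PAIRED (assumed: one spot = one read pair)


def expected_bases(spots: int) -> int:
    """Bases expected for a run if every read is full length."""
    return spots * READS_PER_SPOT * READ_LENGTH_BP


def to_giga(n: int) -> float:
    """Convert a base count to units of 10^9, rounded to one decimal."""
    return round(n / 1e9, 1)


# Spot counts copied from three of the listed runs.
runs = {
    "ERR3239860": 391_803_660,
    "ERR3560882": 38_114_495,
    "ERR3560890": 30_078_556,
}

for accession, spots in runs.items():
    print(accession, to_giga(expected_bases(spots)), "G bases expected")
```

Under these assumptions the three runs come out at 117.5, 11.4 and 9.0 G bases, which agrees with the # of Bases values listed for them.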

ID: 7513655
